Effect of audio-visual asynchrony between time-expanded speech and a moving image of a talker's face on detection and tolerance thresholds

Authors

  • Shuichi Sakamoto
  • Akihiro Tanaka
  • Shun Numahata
  • Atsushi Imai
  • Tohru Takagi
  • Yôiti Suzuki
Abstract

In this study, we measured detection and tolerance thresholds of auditory-visual asynchrony between time-expanded speech and a moving image of the talker’s face. During the experiments, words were presented under two conditions: asynchrony produced by time-expanded speech (expansion condition: EXP) and asynchrony produced by a simple timing shift (asynchronous condition: ASYN). We used 16 shorter Japanese words (four morae) and 20 longer Japanese words (seven or eight morae). All auditory speech was presented in pink noise to avoid a ceiling effect. The SNRs for shorter and longer words were set to −10 dB and −3.5 dB, respectively. For EXP, auditory speech signals were analyzed and resynthesized using STRAIGHT to change the words’ duration (Kawahara et al., 1998). The resynthesized auditory signals were combined with the visual signals so that the onset of the utterance was synchronous. For ASYN, the auditory speech signal was simply lagged behind the visual speech signal. Results showed that detection and tolerance thresholds for longer words were higher than those for shorter words. However, when the thresholds were recalculated as a function of the ratio of the expansion rate to word duration, these differences were not observed. These results suggest that detection and tolerance thresholds for auditory-visual asynchrony between time-expanded speech and a moving image of the talker’s face might depend on the ratio of the expansion rate to word duration.
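The abstract gives the SNRs (−10 dB and −3.5 dB) but not how the noise level was set. As a minimal sketch only, assuming the SNR is defined on RMS levels over the whole word, the following shows one common way to scale a noise signal to a target SNR before mixing. The function names (`rms`, `mix_at_snr`, `measured_snr_db`) are hypothetical and are not taken from the paper.

```python
import math


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech RMS / noise RMS matches `snr_db`, then mix.

    Assumption: SNR is computed from RMS levels over the full word.
    The abstract does not state which definition the authors used.
    Returns (mixed, scaled_noise).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # SNR_dB = 20 * log10(rms_speech / rms_noise), solved for the noise gain.
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    scaled_noise = [gain * n for n in noise]
    mixed = [s + n for s, n in zip(speech, scaled_noise)]
    return mixed, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in dB computed from the RMS levels of the two signals."""
    return 20 * math.log10(rms(speech) / rms(noise))
```

With any speech and noise arrays, `mix_at_snr(speech, noise, -10.0)` would give the mixture for the shorter words and `mix_at_snr(speech, noise, -3.5)` the mixture for the longer words under this assumed definition. Generating the pink noise itself is outside the scope of this sketch.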

Related resources

Aging effect on audio-visual speech asynchrony perception: comparison of time-expanded speech and a moving image of a talker's face

In this study, we measured detection and tolerance thresholds of auditory-visual asynchrony between time-expanded speech and a moving image of the talker’s face for older adults. During experiments, words were presented under two conditions: asynchrony by time-expanded speech (expansion condition, EXP) and simple timing shift (asynchronous condition, ASYN). We used 16 Japanese shorter words (fo...

Effects of speech-rate conversion on asynchrony perception of audio-visual speech

Previous studies showed that a time-expanded speech signal and the visual cue of a moving image of the talker’s face improve speech intelligibility. Sakamoto et al. (2008) investigated the detection thresholds of auditory-visual asynchrony for time-expanded speech and a moving image of the talker’s face. Their results showed that detection thresholds in longer words were higher than those for shorter words...

Effect of speed difference between time-expanded speech and talker's moving image on word or sentence intelligibility

This study investigated the effects on speech intelligibility of asynchrony between a speech signal and a talker’s moving image induced by time-expansion of the speech signal. First, a word intelligibility test (Exp. 1) was administered to younger listeners. Words were processed using STRAIGHT software to expand the speech signal by 0 to 400 ms. The word intelligibility test was administere...

Detection of auditory (cross-spectral) and auditory-visual (cross-modal) synchrony

Detection thresholds for temporal synchrony in auditory and auditory-visual sentence materials were obtained on normal-hearing subjects. For auditory conditions, thresholds were determined using an adaptive-tracking procedure to control the degree of temporal asynchrony of a narrow audio band of speech, both positive and negative in separate tracks, relative to three other narrow audio bands of...

Cipher text only attack on speech time scrambling systems using correction of audio spectrogram

Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...

Publication date: 2008